449 research outputs found

    How biased are maximum entropy models?

    Get PDF
    Maximum entropy models have become popular statistical models in neuroscience and other areas in biology, and can be useful tools for obtaining estimates of mutual information in biological systems. However, maximum entropy models fit to small data sets can be subject to sampling bias; i.e. the true entropy of the data can be severely underestimated. Here we study the sampling properties of estimates of the entropy obtained from maximum entropy models. We show that if the data is generated by a distribution that lies in the model class, the bias is equal to the number of parameters divided by twice the number of observations. However, in practice, the true distribution is usually outside the model class, and we show here that this misspecification can lead to much larger bias. We provide a perturbative approximation of the maximally expected bias when the true model is out of model class, and we illustrate our results using numerical simulations of an Ising model; i.e. the second-order maximum entropy distribution on binary data.

    Kernelized information bottleneck leads to biologically plausible 3-factor Hebbian learning in deep networks

    Get PDF
    The state-of-the art machine learning approach to training deep neural networks, backpropagation, is implausible for real neural networks: neurons need to know their outgoing weights; training alternates between a forward pass (computation) and a backward pass (learning); and the algorithm needs a large amount of labeled data. Biologically plausible approximations to backpropagation, such as feedback alignment, solve the weight transport problem, but not the other two. Thus, fully biologically plausible learning rules have so far remained elusive. Here we present a family of learning rules that does not suffer from any of these problems. It is motivated by the information bottleneck principle (extended with kernel methods), in which networks learn to squeeze as much information as possible out of the input without sacrificing prediction of the output. The resulting rules have a 3-factor Hebbian structure: they require pre- and post-synaptic firing rates and a global error signal - the third factor - that can be supplied by a neuromodulator. Moreover, they do not require precise labels; instead, they rely on the similarity between the desired outputs. They thus solve all three implausibility issues of backpropagation. Moreover, to obtain good performance on hard problems and retain biologically plausible learning rules, our rules need divisive normalization - a known feature of biological networks. Finally, simulations show that our rule performs nearly as well as backpropagation on image classification tasks.Comment: 20 pages, 2 figure

    Developmental and evolutionary constraints on olfactory circuit selection

    Get PDF
    SignificanceIn this work, we explore the hypothesis that biological neural networks optimize their architecture, through evolution, for learning. We study early olfactory circuits of mammals and insects, which have relatively similar structure but a huge diversity in size. We approximate these circuits as three-layer networks and estimate, analytically, the scaling of the optimal hidden-layer size with input-layer size. We find that both longevity and information in the genome constrain the hidden-layer size, so a range of allometric scalings is possible. However, the experimentally observed allometric scalings in mammals and insects are consistent with biologically plausible values. This analysis should pave the way for a deeper understanding of both biological and artificial networks

    A balanced memory network

    Get PDF
    A fundamental problem in neuroscience is understanding how working memory-the ability to store information at intermediate timescales, like tens of seconds-is implemented in realistic neuronal networks. The most likely candidate mechanism is the attractor network, and a great deal of effort has gone toward investigating it theoretically. Yet, despite almost a quarter century of intense work, attractor networks are not fully understood. In particular, there are still two unanswered questions. First, how is it that attractor networks exhibit irregular firing, as is observed experimentally during working memory tasks? And second, how many memories can be stored under biologically realistic conditions? Here we answer both questions by studying an attractor neural network in which inhibition and excitation balance each other. Using mean-field analysis, we derive a three-variable description of attractor networks. From this description it follows that irregular firing can exist only if the number of neurons involved in a memory is large. The same mean-field analysis also shows that the number of memories that can be stored in a network scales with the number of excitatory connections, a result that has been suggested for simple models but never shown for realistic ones. Both of these predictions are verified using simulations with large networks of spiking neurons

    Sparse connectivity for MAP inference in linear models using sister mitral cells

    Get PDF
    Sensory processing is hard because the variables of interest are encoded in spike trains in a relatively complex way. A major goal in studies of sensory processing is to understand how the brain extracts those variables. Here we revisit a common encoding model in which variables are encoded linearly. Although there are typically more variables than neurons, this problem is still solvable because only a small number of variables appear at any one time (sparse prior). However, previous solutions require all-to-all connectivity, inconsistent with the sparse connectivity seen in the brain. Here we propose an algorithm that provably reaches the MAP (maximum a posteriori) inference solution, but does so using sparse connectivity. Our algorithm is inspired by the circuit of the mouse olfactory bulb, but our approach is general enough to apply to other modalities. In addition, it should be possible to extend it to nonlinear encoding models
    corecore